The two most important R packages for manipulating spatial objects
are:
sf
vectors, points, lines, polygons, shapefiles and geopackages
terra
rasters or regularly gridded data
You may see some code or
package that still rely on the packages sp,
rgdal, rgeos or
raster, but these are now mostly retired and were
entirely replaced by sf and terra.
Another important package is stars for spatiotemporal
arrays and raster and vector data cubes, but we won’t talk about this
package in this workshop.
First, let’s look in the folder where the shapefile is located.
list.files("/home/frousseu/Documents/github/spatialR/data")
## [1] "carreteras.dbf" "carreteras.prj" "carreteras.sbn"
## [4] "carreteras.sbx" "carreteras.shp" "carreteras.shp.xml"
## [7] "carreteras.shx" "cat.txt" "cdem_dem_021E_tif"
## [10] "cdem_dem_021E.tif" "cdem_dem_031F_tif" "cdem_dem_031F_tif.zip"
## [13] "cdem_dem_031F.tif" "cdem_dem_031G_tif" "cdem_dem_031G_tif.zip"
## [16] "cdem_dem_031G.tif" "cdem_dem_031H_tif" "cdem_dem_031H.tif"
## [19] "rast.tif" "roads.dbf" "roads.gpkg"
## [22] "roads.prj" "roads.shp" "roads.shx"
## [25] "test.dbf" "test.gpkg" "test.prj"
## [28] "test.shp" "test.shx"
Reading shapefiles is done with the function
st_read.
library(sf)
roads <- st_read("/home/frousseu/Documents/github/spatialR/data/carreteras.shp")
## Reading layer `carreteras' from data source
## `/home/frousseu/Documents/github/spatialR/data/carreteras.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1171 features and 4 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: 123569.7 ymin: 1963648 xmax: 309483.2 ymax: 2128266
## Projected CRS: WGS 84 / UTM zone 16N
This object can be plotted simply with the function plot
which has a method for sf objects. The st_geometry is used
to only plot the geometry of the object, but not the attributes.
plot(st_geometry(roads), axes = TRUE)
roads?class(roads)
## [1] "sf" "data.frame"
head(roads)
## Simple feature collection with 6 features and 4 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: 269922.2 ymin: 1964531 xmax: 292506.3 ymax: 1976234
## Projected CRS: WGS 84 / UTM zone 16N
## CLASIF LENGTH DESCRIPCIO PAíS geometry
## 1 C4 747.849 Veredas Guatemala LINESTRING (270613.6 196999...
## 2 C4 1303.025 Veredas Belice LINESTRING (281424.3 197242...
## 3 C4 2055.814 Veredas Belice LINESTRING (292506.3 196456...
## 4 C4 1807.620 Veredas Belice LINESTRING (277670.8 196523...
## 5 C4 2661.517 Veredas Belice LINESTRING (272900.7 197484...
## 6 C4 547.114 Veredas Mexico LINESTRING (270909.1 197623...
It is mostly a data.frame to which we add a geometry
column.
data.framehead(as.data.frame(roads))
## CLASIF LENGTH DESCRIPCIO PAíS geometry
## 1 C4 747.849 Veredas Guatemala LINESTRING (270613.6 196999...
## 2 C4 1303.025 Veredas Belice LINESTRING (281424.3 197242...
## 3 C4 2055.814 Veredas Belice LINESTRING (292506.3 196456...
## 4 C4 1807.620 Veredas Belice LINESTRING (277670.8 196523...
## 5 C4 2661.517 Veredas Belice LINESTRING (272900.7 197484...
## 6 C4 547.114 Veredas Mexico LINESTRING (270909.1 197623...
plot(roads["CLASIF"], lwd = 4, axes = TRUE)
plot(roads, lwd = 4, axes = TRUE)
Often, we only want to plot the geometry to see if everything looks
OK. This is done with the st_geometry function.
plot(st_geometry(roads), lwd = 4, axes = TRUE)
Alternatively, you can use the mapview package for a more interactive plot.
library(mapview)
mapview(roads)
We can write in different formats. The geopackage format (.gpkg) is used more and more compared to the .shp format.
# As a shapefile
st_write(roads, dsn = "/home/frousseu/Documents/github/spatialR/data/roads.shp", append = FALSE) # append = FALSE overwrites
## Deleting layer `roads' using driver `ESRI Shapefile'
## Writing layer `roads' to data source
## `/home/frousseu/Documents/github/spatialR/data/roads.shp' using driver `ESRI Shapefile'
## Writing 1171 features with 4 fields and geometry type Line String.
# As a geopackage
st_write(roads, dsn = "/home/frousseu/Documents/github/spatialR/data/roads.gpkg", append = FALSE)
## Deleting layer `roads' using driver `GPKG'
## Writing layer `roads' to data source
## `/home/frousseu/Documents/github/spatialR/data/roads.gpkg' using driver `GPKG'
## Writing 1171 features with 4 fields and geometry type Line String.
list.files("/home/frousseu/Documents/github/spatialR/data")
## [1] "carreteras.dbf" "carreteras.prj" "carreteras.sbn"
## [4] "carreteras.sbx" "carreteras.shp" "carreteras.shp.xml"
## [7] "carreteras.shx" "cat.txt" "cdem_dem_021E_tif"
## [10] "cdem_dem_021E.tif" "cdem_dem_031F_tif" "cdem_dem_031F_tif.zip"
## [13] "cdem_dem_031F.tif" "cdem_dem_031G_tif" "cdem_dem_031G_tif.zip"
## [16] "cdem_dem_031G.tif" "cdem_dem_031H_tif" "cdem_dem_031H.tif"
## [19] "rast.tif" "roads.dbf" "roads.gpkg"
## [22] "roads.prj" "roads.shp" "roads.shx"
## [25] "test.dbf" "test.gpkg" "test.prj"
## [28] "test.shp" "test.shx"
cat <- read.table("/home/frousseu/Documents/github/spatialR/data/cat.txt", header = TRUE)
head(cat)
## X Y Attack
## 1 -89.34188 18.49952 0
## 2 -89.41309 18.47991 0
## 3 -89.26248 18.60019 1
## 4 -89.19920 18.59252 1
## 5 -89.30334 18.59921 1
## 6 -89.23133 18.59242 1
This is a simple data.frame with X and Y columns
representing longitudes and latitudes. The column Attack
contains whether a location was concerned by a jaguar attack or not.
To transform this data.frame to a spatial object, we
just have to do:
cat <- st_as_sf(cat, coords = c("X", "Y"), crs = 4326)
The object returned is now an sf data.frame. The code
4326 is a special id for the lat/lon WGS84 coordinate
reference system (crs).
Here is what it looks like.
plot(st_geometry(cat), col = adjustcolor(ifelse(cat$Attack == 0, "blue", "red"), 0.4), pch = 16,
cex = 1.5, axes = TRUE)
mapview(roads) + mapview(cat)
Build a spatial data.frame with 100 random locations
centered on the CEF conference, plot the results with R and
mapview and write the results to a shapefile or a
geopackage (.gpkg). Read the shapefile you wrote to make sure everything
worked well.
Hint: use the function runif or rnorm to
generate random values within a given range. For example:
lat <- rnorm(100, mean = 45, sd = 0.1) # 100 random values with a mean of 45 and an sd of 0.1
lon <- rnorm(100, mean = 72, sd = 0.1)
Here is a solution
quebec <- st_read("/home/frousseu/Documents/github/spatialR/québec.gpkg")
## Reading layer `québec' from data source
## `/home/frousseu/Documents/github/spatialR/québec.gpkg' using driver `GPKG'
## Simple feature collection with 1 feature and 11 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -80.27292 ymin: 44.9912 xmax: -57.10779 ymax: 63.68542
## Geodetic CRS: WGS 84
st_crs(quebec)
## Coordinate Reference System:
## User input: WGS 84
## wkt:
## GEOGCRS["WGS 84",
## ENSEMBLE["World Geodetic System 1984 ensemble",
## MEMBER["World Geodetic System 1984 (Transit)"],
## MEMBER["World Geodetic System 1984 (G730)"],
## MEMBER["World Geodetic System 1984 (G873)"],
## MEMBER["World Geodetic System 1984 (G1150)"],
## MEMBER["World Geodetic System 1984 (G1674)"],
## MEMBER["World Geodetic System 1984 (G1762)"],
## MEMBER["World Geodetic System 1984 (G2139)"],
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]],
## ENSEMBLEACCURACY[2.0]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## USAGE[
## SCOPE["Horizontal component of 3D system."],
## AREA["World."],
## BBOX[-90,-180,90,180]],
## ID["EPSG",4326]]
par(mfrow = c(1, 2))
plot(st_geometry(quebec), axes = TRUE)
quebec <- st_transform(quebec, 6622)
plot(st_geometry(quebec), axes = TRUE)
As always, when playing with spatial data, one needs to be aware of
coordinate reference systems (CRS). Assigning and transforming CRS in sf
can be done with the st_transform function.
First, let’s get some data from GBIF using the rgbif package. We will get occurrence data for a spring plant species, the Red Trillium (Trille rouge, Trillium erectum).
library(rgbif)
occs <- occ_search(scientificName = "Trillium erectum", hasCoordinate = TRUE, stateProvince = "Québec",
country = "CA", limit = 5000)
trille <- st_as_sf(occs$data, coords = c("decimalLongitude", "decimalLatitude"))
plot(st_geometry(trille), axes = TRUE)
st_crs(trille)
## Coordinate Reference System: NA
We haven’t assigned a crs to the sf object.
See https://epsg.io/ to find CRS
EPSG codes.
4326 - WGS 84 – WGS84 - World
Geodetic System 1984, used in GPS
32198 - NAD83 /
Quebec Lambert
26918 - NAD83 / UTM zone 18N
9820 - Lambert Azimuthal Equal Area
6622 - NAD83(CSRS)v2 / Quebec Lambert
etc.
Assigning a coordinate reference system can be done with the
functions st_crs or it can be done when transforming a
data.frame to a spatial data.frame with
st_as_sf.
st_crs(trille) <- 4326
st_crs(trille)
## Coordinate Reference System:
## User input: EPSG:4326
## wkt:
## GEOGCRS["WGS 84",
## ENSEMBLE["World Geodetic System 1984 ensemble",
## MEMBER["World Geodetic System 1984 (Transit)"],
## MEMBER["World Geodetic System 1984 (G730)"],
## MEMBER["World Geodetic System 1984 (G873)"],
## MEMBER["World Geodetic System 1984 (G1150)"],
## MEMBER["World Geodetic System 1984 (G1674)"],
## MEMBER["World Geodetic System 1984 (G1762)"],
## MEMBER["World Geodetic System 1984 (G2139)"],
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]],
## ENSEMBLEACCURACY[2.0]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## USAGE[
## SCOPE["Horizontal component of 3D system."],
## AREA["World."],
## BBOX[-90,-180,90,180]],
## ID["EPSG",4326]]
st_as_sftrille <- st_as_sf(occs$data, coords = c("decimalLongitude", "decimalLatitude"), crs = 4326)
plot(st_geometry(trille), axes = TRUE)
# 32198 NAD83 / Quebec Lambert
trille <- st_transform(trille, 32198)
plot(st_geometry(trille), axes = TRUE)
When plotting or working with multiple spatial objects, the same CRS needs to be used!
plot(st_geometry(quebec))
plot(st_geometry(trille), add = TRUE)
The geodata package
returns a SpatVector which we transform to a sf object.
The ms_simplify function from the rmapshaper package is very
useful to reduce the precision of polygons and maintaining the topology
(and adjacent polygons).
library(geodata)
canada <- gadm("GADM", country = "CAN", level = 1) |>
st_as_sf() # transform a SpatVector to an sf data.frame
library(rmapshaper)
canada <- ms_simplify(canada, 0.001) # simplify shape and keep 0.1% of points
Let’s use Canada to more easily see the differences between some CRS.
crs <- c(4326, 2958, 32198, 3995)
par(mfrow = c(2, 2), mar = c(2, 3, 2, 0))
for (i in 1:length(crs)) {
canada2 <- st_transform(canada, crs = crs[i])
plot(st_geometry(canada2), axes = TRUE, main = paste("epsg", crs[i]), cex.main = 0.7)
}
Geographic coordinates are coordinates on a sphere or on an ellipse, while projected coordinates are defined on a flat 2D surface. Usually, geographic coordinates are in latitudes/longitudes and projected coordinates are in meters. As seen earlier, this is important when doing certain spatial operations. All functions from package rgeos require that spatial objects are projected. Otherwise, functions often do not work properly because they assume that calculations are done on cartesian coordinates.
st_bbox(canada)
## xmin ymin xmax ymax
## -141.00687 41.67693 -52.63496 83.10833
st_bbox(st_transform(canada, crs[3]))
## xmin ymin xmax ymax
## -3754088.6 -146642.3 1182482.0 4557090.5
The EPSG code 4326 is equivalent to giving the
"+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
CRS to the data.
Get some occurrences for the monarch butterfly (Danaus
plexippus) using rgbif and the occ_search function and
plot them for Quebec using the gadm function from the
geodata package. Use the UTM zone 18N CRS.
monarchs <- occ_search(scientificName = "Danaus plexippus", hasCoordinate = TRUE, stateProvince = "Québec",
country = "CA", limit = 5000)
canada <- gadm("GADM", country = "CAN", level = 1) |>
st_as_sf() # transform a SpatVector to an sf data.frame
monarchs <- st_as_sf(monarchs$data, coords = c("decimalLongitude", "decimalLatitude"), crs = 4326)
monarchs <- st_transform(monarchs, 32618)
qc <- canada[canada$NAME_1 == "Québec", ]
qc <- st_transform(qc, 32618)
plot(st_geometry(qc))
plot(st_geometry(monarchs), add = TRUE)
Most spatial operations are available in the package sf.
One of the requirement for functions to work well for spatial operations
in sf is that objects are in a projected coordinate
system (i.e. not in lat/lon 4326). This can be checked with
st_is_longlat.
st_is_longlat(trille)
## [1] FALSE
st_is_longlat(quebec)
## [1] FALSE
st_is_longlat(canada)
## [1] TRUE
The function over from package sp is used to determine whether different entities overlap. It returns a vector or a data.frame with the identities or the characteristics of overlapping elements.
outaouais <- st_read("/home/frousseu/Documents/github/spatialR/outaouais.gpkg")
## Reading layer `outaouais' from data source
## `/home/frousseu/Documents/github/spatialR/outaouais.gpkg' using driver `GPKG'
## Simple feature collection with 1 feature and 13 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -75.90876 ymin: 45.3727 xmax: -75.34527 ymax: 45.59912
## Geodetic CRS: WGS 84
trille <- st_transform(trille, st_crs(outaouais))
plot(st_geometry(outaouais), axes = TRUE)
plot(st_geometry(trille), add = TRUE)
mapview(outaouais) + mapview(trille)